Abstract

In this paper we consider the problem of detecting separation of agents from a base station in robotic
and sensor networks. Such separation can be caused by mobility and/or failure of the agents. While separation/cut 
detection may be performed by passing messages between a node and the base in static networks, such a solution is 
impractical for networks with high mobility, since routes are constantly changing. We propose a distributed algorithm 
to detect separation from the base station. The algorithm consists of an averaging scheme in which every node updates 
a scalar state by communicating with its current neighbors. We prove that if a node is permanently disconnected from 
the base station, its state converges to 0. If a node is connected to the base station in an average sense, even if not 
connected in any instant, then we show that the expected value of its state converges to a positive number. Therefore, a
node can detect if it has been separated from the base station by monitoring its state. The effectiveness of the proposed 
algorithm is demonstrated through simulations, a real system implementation and experiments involving both static as 
well as mobile networks. 

Keywords: mobile ad-hoc network, robotic network, sensor network, fault detection 
1. Introduction 

Sensor and robotic networks are a quickly developing area extending the boundaries of traditional robotics and
usual sensor networks. In such a network, static as well as mobile nodes (robots) with varying levels of sensing,
communication and actuation capability are used to observe, monitor, and control the state of physical processes. For 
example, in the scenario depicted in Figure 1, a team of ground robots may serve as information aggregators from a
large number of static sensors deployed in an area as well as relays to send the processed data to more maneuverable 
autonomous aerial vehicles. We refer to all the devices that take part in sharing information, whether static or mobile,
as nodes or agents. Thus in Figure 1 the agents are the chopper, the mobile ground vehicles and the static sensors.

The communication topology of a robotic and sensor network is likely to change over time. Such changes can 
occur not only due to the mobility of the robotic nodes, but are also likely with static agents due to failures. An agent 
may fail due to various factors such as mechanical/electrical problems, environmental degradation, battery depletion, 
or hostile tampering. These causes are especially common for networks deployed in harsh and dangerous situations 
for applications such as forest fire monitoring, battlefield or emergency response operations.

Figure 1: A heterogeneous robotic sensor network: ground robots aggregating information collected from a large number of static sensor nodes and
relaying to an aerial robot.



Information exchange through wireless communication is a crucial ingredient for a sensor and robotic network 
system to carry out the tasks it is deployed for. Long range information exchange between agents is usually achieved
by multi-hop wireless communication. In mobile networks, it is quite possible that the agents get separated into two or 
more sub-networks with no paths for data routing among these sub-networks. We say that a cut has occurred in such 
an event. A node that is disconnected from the base station at some time may also get reconnected later, e.g., when a 
mobile node moves in such a way that it restores connectivity between two disconnected sets of agents. In a robotic 
and sensor network, cuts can occur and disappear due to a combination of node mobility and node failure. 

Multi-hop transmission typically requires routing data from a source node to a sink node, which is usually the 
base station. In a network with highly dynamic topology - common for sensor and robotic networks - maintaining 
updated routing tables, or discovering routing paths on demand, are challenging and energy-inefficient tasks. In
such situations, information transfer from a node to a base station is sometimes accomplished by waiting till the node
comes within range of the base station or of another node that is close to the base station [5]. In either scenario, it
is imperative for the agents to know if a cut occurs, so necessary action can be taken. For example, once a cut is 
detected, mobile robots can attempt to heal the network by repositioning themselves or placing new communication 
relay nodes in critical regions. There are other advantages of an ability to detect separation from the base. If a node 
that is separated from the base station initiates data transmission to the base station, it will only lead to wastage of 
precious on-board energy of itself and its nearby nodes that attempt to route the packets to the base station, since no 
path to the destination exists. Therefore, after a cut occurs it is better for such nodes not to initiate any long-range
information transfer. This requires the nodes to be able to monitor their connectivity to the base station. In addition,
any proposed solution for separation detection cannot rely on centralized information processing, since separation will
prevent information from reaching the central node.

A cut is defined for static networks as the separation of the network into two or more disjoint components.
However, for mobile networks we need to distinguish between a node getting disconnected from the rest of the nodes 
temporarily, for a short time interval, from getting disconnected for a very long time interval, or in the extreme case, 
all future time. A node may get disconnected temporarily due to mobility, or due to the failure of certain nodes, and 
then reconnected later. The more dynamic the topology is, the more likely that disconnections and re-connections will 
occur frequently. Therefore, what is needed is an ability of each node to detect if it has been disconnected from the
source for a long enough time that its ability to send data to the base station, if the need arises, is seriously hampered.
If the disconnection is for a very short period, that is not a serious cause for concern as the node can simply wait for a
brief period to see if it gets connected again, which may occur due to the motion of itself or other nodes. We refer to
the short-lived case as intermittent disconnection and to the long-lasting case as permanent disconnection. The qualifier "permanent"
is qualitative; it merely means "long enough" to necessitate some action on the part of the node as discussed earlier.

However, little attention has been paid to the problem of detecting cuts. Notable exceptions are Kleinberg et
al., who study the problem of detecting network failures in wired networks, and Shrivastava et al. and
Barooah [8], who propose algorithms for detecting cuts in wireless sensor networks. The problem of detecting cuts
in mobile networks has attracted even less attention. The solutions that have been proposed require routing packets from
the base station to each node periodically. When a node fails to receive this packet for a certain amount of time,
it suspects that a cut has occurred. This approach, however, requires routing data between far away nodes, which
is challenging in networks of mobile nodes. While algorithms for coordinating the motion of the agents to ensure
network connectivity have been developed in recent times, such algorithms cannot guarantee
network connectivity under all circumstances. This is especially true when the robotic agents are operating in harsh
and uncertain environments. Thus, there is a need to develop distributed algorithms for cut detection in robotic and
sensor networks.

In this paper we describe a simple algorithm for cut detection in robotic and sensor networks. The algorithm 
is applicable to networks made up of static or mobile agents, or a combination of the two. This algorithm - called 
Distributed Source Separation Detection (DSSD) algorithm - is designed to allow every node to monitor if it is connected
to a specially designated node, the so-called source node, or if it has been disconnected from the source node.
The source node is usually a base station. The reason for this terminology comes from an electrical analogy that the 
algorithm is based on. The idea is quite simple: imagine the wireless communication network as an electrical circuit 
where current is injected at the source node. When a cut separates certain nodes from the source node, the potential 
at these nodes becomes zero. If a node is connected to the source node through multi-hop paths, either always or in 
a time-average sense, the potential is positive in a time-average sense. In the DSSD algorithm, every node updates a
scalar (called its state) which is an estimate of its virtual potential in the fictitious electrical network. The state update
is performed by a distributed algorithm that only requires a node to communicate with its neighbors. By monitoring their
computed estimates of the potential, nodes can detect if they are connected to the source node or not. In addition,
the iterative scheme used in computing the node potentials is extremely fast, which makes the algorithm scalable to
large networks. The proposed DSSD algorithm is fully distributed and asynchronous. It requires communication only
between neighbors; multi-hop routing between node pairs that are not neighbors is not needed.

Performance of the algorithm in networks of static nodes was examined through simulations previously in [8].
Here we extend the algorithm to both static and mobile networks. In the mobile case, by modeling the evolution of 
the network over time as a Markov chain, we show that the node states converge to positive numbers in an expected 
sense as long as a path exists in a time-average sense between the node and the base station. The performance of the
algorithm has been tested in simulations as well as in an experimental platform with mobile robots and human agents. 
These tests demonstrate the effectiveness of the algorithm in detecting separation (and reconnection) of nodes in both 
static and mobile networks. Since there is no prior work on the problem of detecting separation in mobile
networks that can operate without multi-hop routing, we do not present a comparison of the proposed algorithm with
existing algorithms for cut detection.

The rest of the paper is organized as follows. In Section 2 we introduce the DSSD algorithm. The rationale behind
the algorithm and its theoretical properties are described in Section 3. Sections 4 and 5 describe results from computer
simulations and experimental evaluations. The paper concludes with a summary in Section 6.



2. The Distributed Source Separation Detection (DSSD) Algorithm 

We introduce some terminology about graphs that will be needed to describe the algorithm and the results precisely
(see, for example, the excellent treatise by Diestel). Given a set $\mathcal{V} = \{v_1, \dots, v_m\}$ of $m$ elements referred to as
vertices, and a set $\mathcal{E} = \{(v_i, v_j) \mid v_i, v_j \in \mathcal{V}\}$ of edges, a graph $\mathcal{G}$ is defined as the pair $(\mathcal{V}, \mathcal{E})$. A sensor network is
modeled as a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ whose vertex set $\mathcal{V}$ corresponds to the wireless sensor nodes and whose edges $\mathcal{E}$
describe direct communication between nodes. The size of a graph $\mathcal{G}$ is the number of its vertices $|\mathcal{V}|$. The graphs
formed by the nodes of the sensor and robotic network are assumed to be undirected, i.e. the communication between
two nodes is assumed to be symmetric. In the language of graph theory, the edges of an undirected graph are unordered
pairs with the symmetry relation $(v_i, v_j) = (v_j, v_i)$. The neighbors of vertex $v_i$ form the set $\mathcal{N}_i$ of vertices connected to $v_i$
through an edge, i.e. $\mathcal{N}_i = \{v_j \mid (v_i, v_j) \in \mathcal{E}\}$. The number of neighbors of a vertex, $|\mathcal{N}_i|$, is called its degree. A graph is
called connected if for every pair of its vertices there exists a sequence of edges connecting them.

In a mobile sensor and robotic network the neighbor relationship can change with time, so the graphs in our study
are in general time varying: $\mathcal{G}(k) = (\mathcal{V}, \mathcal{E}(k))$, where $k = 0, 1, \dots$ is a discrete time index. Note that we assume the
set of nodes does not change over time, though certain nodes may fail permanently at some time and thereafter not
take part in the operation of the network.

In the proposed scheme, every node $v_i$ maintains a scalar variable $x_i(k)$ in its local memory, which is called its
state. The base station is designated as the source node, though in principle any node can be the source node. The
reason for this terminology will be explained soon. For ease of description (and without loss of generality), the index
of the source node is taken as 1. The DSSD algorithm is an iterative process consisting of two phases at every discrete
step: (i) State Update and (ii) Cut Detection from State.

DSSD PHASE I (State update law): The scalar state $x_i(k)$ assigned to node $v_i$ is iteratively updated (starting with
$x_i(0) = 0$) according to the following update rule. Recall that the index $i = 1$ corresponds to the source node.

$$x_i(k+1) = \frac{1}{d_i(k)+1}\left(\sum_{v_j \in \mathcal{N}_i(k)} x_j(k) + s\,\mathbb{1}_{\{i=1\}}\right) \tag{1}$$

where $\mathcal{N}_i(k)$ is the set of neighbors of $v_i$ in graph $\mathcal{G}(k)$, $d_i(k) := |\mathcal{N}_i(k)|$ is the degree of $v_i$ (number of neighbors) at
time $k$, $\mathbb{1}_{\{i=1\}}$ equals 1 for the source node ($i = 1$) and 0 for every other node, and $s > 0$ is an arbitrary fixed positive
number that is called the source strength. The source strength is a design parameter and has to be provided to the source node a priori.
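The update rule (1) can be illustrated with a minimal synchronous sketch. The function name `dssd_update`, the 0-based node indexing (the source is node 0 here rather than node 1), and the three-node path graph are assumptions made only for this illustration.

```python
def dssd_update(x, neighbors, s, source=0):
    """One synchronous DSSD state update for every node.

    x         -- list of current node states x_i(k)
    neighbors -- neighbors[i] is the set of current neighbors of node i
    s         -- source strength (positive)
    source    -- index of the source node (0 here; the text uses 1)
    """
    x_next = []
    for i, nbrs in enumerate(neighbors):
        total = sum(x[j] for j in nbrs)
        if i == source:
            total += s  # the source term is added at the source node only
        x_next.append(total / (len(nbrs) + 1))
    return x_next


# Hypothetical path graph 0 - 1 - 2 with the source at node 0.
neighbors = [{1}, {0, 2}, {1}]
x = [0.0, 0.0, 0.0]
for _ in range(200):
    x = dssd_update(x, neighbors, s=100.0)
```

On this small graph the states settle at positive values that decrease with distance from the source.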

DSSD PHASE II (Cut detection from state): Every node $v_i$ determines its connectivity to the source
node by comparing its state $x_i(k)$ to the cut detection threshold $\epsilon$ (a small positive number) as follows:

$$\text{cut\_belief}_i(k) = \begin{cases} 0 & x_i(k) > \epsilon \\ 1 & x_i(k) \le \epsilon \end{cases} \tag{2}$$

where cut_belief_i = 1 means the node believes it is disconnected from the source and 0 means it believes it is connected
to the source.
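As a small illustration of the threshold test (2), the sketch below maps a state to a belief. The function name `cut_belief`, the example states, and the threshold value are hypothetical, and a state exactly equal to the threshold is treated here as "disconnected".

```python
def cut_belief(x_i, eps):
    """Return 1 if the node believes it is cut off from the source, else 0."""
    return 0 if x_i > eps else 1


# Hypothetical states of four nodes and a hypothetical threshold.
states = [62.5, 25.0, 1e-9, 0.0]
beliefs = [cut_belief(x, eps=1e-3) for x in states]
```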

The rationale for the algorithm comes from the interpretation of the states in terms of the potentials in an electrical 
circuit. If the network does not change with time, then the state of a node that is connected to the source converges
to a positive number that is equal to its electrical potential in a fictitious electrical network. If a node is disconnected
from the source then its state converges to 0. When the network is time-varying, the states can still be shown to
converge in a mean-square sense to either positive numbers or 0, depending on whether the node is connected or not
(in some appropriate stochastic sense) to the source. We discuss the details of the electrical analogy and the theoretical
performance guarantees that can be provided for the proposed algorithm in the next section. 

Figure 2: A graph describing a sensor network (left), and the associated electrical network (right). In the electrical network, one node is chosen as
the source that injects s Ampere current into the network, and additional nodes are introduced (fictitiously) that are grounded, through which the
current flows out of the network. The thick line segments in the electrical network are resistors of 1 Ω resistance.

We note that the cut detection threshold $\epsilon$ is a design parameter, and it has to be provided to all the nodes a priori.
The value of $\epsilon$ chosen depends on the source strength $s$. The smaller the value of $s$, the smaller the value of $\epsilon$ that has to be
chosen to avoid false separation detection. We also note that the algorithm as described above assumes that all updates
are done synchronously, or, in other words, every node shares the same iteration counter $k$. In practice, the algorithm is
executed asynchronously without requiring any clock synchronization or keeping a common time counter. To achieve
this, every node keeps a buffer of the last received states of its neighbors. If a node does not receive messages from
a neighbor during a time-out period, it updates its state using the last successfully received state from that neighbor.
When a node does not receive broadcasts from one of its neighbors for a sufficiently long time, it removes that neighbor
from its neighbor set and carries on executing the algorithm with the remaining neighbors.
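The buffering scheme for asynchronous execution might be organized along the following lines. This is only a sketch: the class `DssdNode`, the choice to measure silence in update rounds rather than wall-clock time, and the parameter `drop_after` are assumptions made for illustration, not details taken from the implementation described in the text.

```python
class DssdNode:
    """Hypothetical per-node bookkeeping for asynchronous DSSD execution."""

    def __init__(self, is_source=False, s=100.0, drop_after=3):
        self.is_source = is_source
        self.s = s                    # source strength (used only by the source)
        self.drop_after = drop_after  # silent update rounds before a neighbor is dropped
        self.state = 0.0
        self.buffer = {}              # neighbor id -> last received state
        self.silent = {}              # neighbor id -> updates since last message

    def receive(self, neighbor_id, neighbor_state):
        """Store the latest broadcast from a neighbor."""
        self.buffer[neighbor_id] = neighbor_state
        self.silent[neighbor_id] = 0

    def update(self):
        """Update the state from buffered values, dropping long-silent neighbors."""
        stale = [n for n, rounds in self.silent.items() if rounds >= self.drop_after]
        for nid in stale:
            del self.buffer[nid]
            del self.silent[nid]
        total = sum(self.buffer.values()) + (self.s if self.is_source else 0.0)
        self.state = total / (len(self.buffer) + 1)
        for nid in self.silent:
            self.silent[nid] += 1
        return self.state


node = DssdNode(drop_after=3)
node.receive("a", 10.0)
# The neighbor then goes silent: its last state is reused, then it is dropped.
states = [node.update() for _ in range(4)]
```

In this run the buffered value is reused for three updates, after which the silent neighbor is removed and the state falls to zero.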

3. Algorithm Explanation and Theoretical Results 

When the graph does not change with time, i.e., $\mathcal{G}(k) = \mathcal{G}$ for some graph $\mathcal{G}$, the state update law is an iterative method
for computing the node potentials in a fictitious electrical network $\mathcal{G}^{elec} = (\mathcal{V}^{elec}, \mathcal{E}^{elec})$ that is constructed from the graph $\mathcal{G}$
as follows. First, define $\mathcal{V}^{elec} := \mathcal{V} \cup \{g\}$ where $g$ is a fictitious grounded node, and next, introduce $n$ additional edges
to connect each of the $n$ nodes in $\mathcal{V}$ to the node $g$ with a single edge. So the edges in $\mathcal{E}^{elec}$ consist of the edges in $\mathcal{E}$ and
the newly introduced edges. Now an electrical network $(\mathcal{G}^{elec}, 1)$ is imagined by assigning to every edge of $\mathcal{G}^{elec}$ a unit
resistance. Figure 2 shows a physical network and the corresponding fictitious electrical network. It can be shown that
in a time invariant network, the node states resulting from the state update law always converge to constants [8]. In
fact, the limiting value of a node state in a graph $\mathcal{G}$ is the potential of the node in the fictitious electrical network $\mathcal{G}^{elec}$ in
which $s$ Ampere current is injected at the source node and is extracted at a fictitious grounded node, which is always
maintained at potential 0 by virtue of being grounded (see Theorem 1 of [8]).
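The electrical interpretation can be checked numerically on a small example. The sketch below iterates the state update on a hypothetical 4-node cycle (0-based indices, source at node 0) and then verifies Kirchhoff's current law at every node, treating each edge and each node-to-ground link as a unit resistor. The helper names are illustrative only.

```python
def iterate_to_fixed_point(neighbors, s, source=0, iters=500):
    """Run the synchronous state update from x = 0 for a fixed number of steps."""
    x = [0.0] * len(neighbors)
    for _ in range(iters):
        x = [
            (sum(x[j] for j in nbrs) + (s if i == source else 0.0)) / (len(nbrs) + 1)
            for i, nbrs in enumerate(neighbors)
        ]
    return x


def net_current_out(x, neighbors, i):
    """Current leaving node i through its unit resistors (to neighbors and to ground)."""
    to_neighbors = sum(x[i] - x[j] for j in neighbors[i])
    to_ground = x[i] - 0.0  # the grounded node is held at potential 0
    return to_neighbors + to_ground


# Hypothetical 4-node cycle 0 - 1 - 2 - 3 - 0.
neighbors = [{1, 3}, {0, 2}, {1, 3}, {0, 2}]
s = 100.0
x = iterate_to_fixed_point(neighbors, s)

# Current balance: s is injected at the source, nothing is injected elsewhere.
residuals = [net_current_out(x, neighbors, i) - (s if i == 0 else 0.0) for i in range(4)]
ground_current = sum(x)  # total current flowing into the grounded node
```

All of the injected current leaves through the grounded node, so the limiting states sum to the source strength.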

The evolution of the node states for a static network of nodes was analyzed in [8], where it was shown that for a
time invariant graph that is connected (and therefore every node is connected to the source), the state of every node
converges to a positive number (see Theorem 1 of [8]). For nodes that are disconnected from the source, it was shown
that their states converge to 0, and in fact this result holds even if the component(s) of nodes that are disconnected
from the source are changing with time due to mobility etc. We state the precise result below:

Theorem 1. Let the nodes of a sensor network with an initially connected undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ execute
the DSSD algorithm starting at time $k = 0$ with initial condition $x_i(0) = 0$, $i = 1, \dots, |\mathcal{V}|$. If a node $v_i$ gets disconnected
from the source at a time $\tau > 0$ and stays disconnected for all future time, then its state $x_i(k)$ converges to 0 as $k \to \infty$.

This result is useful in detecting disconnection from the source; if the state of a node converges to 0 (which can be
determined by checking if it becomes smaller than a threshold), then the node detects that it is disconnected from the 
source. This partially explains the logic behind the Phase II of the algorithm. 
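The behavior described in Theorem 1 can be observed in a small simulation. The sketch below uses a hypothetical 4-node path (0-based indices, source at node 0) in which one edge is removed permanently after the states have settled; the graph, the source strength, and the threshold are illustrative choices.

```python
def step(x, neighbors, s, source=0):
    """One synchronous DSSD state update."""
    return [
        (sum(x[j] for j in nbrs) + (s if i == source else 0.0)) / (len(nbrs) + 1)
        for i, nbrs in enumerate(neighbors)
    ]


connected = [{1}, {0, 2}, {1, 3}, {2}]  # path 0 - 1 - 2 - 3
after_cut = [{1}, {0}, {3}, {2}]        # edge (1, 2) removed permanently

s, eps = 100.0, 1e-3
x = [0.0] * 4
for _ in range(100):
    x = step(x, connected, s)
before = list(x)                        # states while the network is connected

for _ in range(100):
    x = step(x, after_cut, s)
beliefs = [1 if xi <= eps else 0 for xi in x]
```

After the cut, the states of nodes 2 and 3 decay toward zero and fall below the threshold, while nodes 0 and 1 keep positive states.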



To detect connectivity to the source, we have to answer the question: how do the states of nodes that are intermittently
connected to the source due to their own - or other nodes' - mobility evolve? In a mobile network the graph at
any time instant is essentially random since it depends on which nodes are within range of which other nodes at that
instant, which is difficult to predict in all but the most trivial situations. We therefore model the evolution of the graph
$\mathcal{G}(k)$ as a stochastic process. In that case the state of every node is also a random variable, whose value depends on the
evolution of the graphs. Assuming the evolution of the graph can be described by a Markov chain, we then show that
the node states converge in the mean square sense, meaning that the mean and variance of each node's state converge to
specific values. We also provide formulas for these values.

We consider the case when the sequence of graphs $\{\mathcal{G}(k)\}_{k=0}^{\infty}$ can be modeled as the realization of a Markov chain,
whose state space $\mathbb{G} := \{\mathcal{G}_1, \dots, \mathcal{G}_N\}$ is the set of graphs that can be formed by the mobile nodes due to their motion.
The network at time $k$ can be any one of the elements of the set $\mathbb{G}$, i.e., $\mathcal{G}(k) \in \mathbb{G}$. The Markovian property means that
if $\mathcal{G}_i \in \mathbb{G}$, then $\Pr(\mathcal{G}(k+1) = \mathcal{G}_i \mid \mathcal{G}(k)) = \Pr(\mathcal{G}(k+1) = \mathcal{G}_i \mid \mathcal{G}(k), \mathcal{G}(k-1), \dots, \mathcal{G}(0))$, where $\Pr(\cdot)$ denotes probability.
A simple example in which the time variation of the graphs satisfies the Markovian property is that of a network of
static nodes with unreliable communication links such that each link can fail temporarily, and the failure of each edge
at every time instant $k$ is independent of the failures of other links and the probability of its failure is time-invariant.
Another example is a network of mobile agents whose motion obeys first order dynamics with range-determined
communication. Specifically, suppose the position of node $v_i$ at time $k$, denoted by $p_i(k)$, is restricted to lie on the
unit sphere $\mathbb{S}^2 = \{x \in \mathbb{R}^3 \mid \|x\| = 1\}$, and suppose the position evolution obeys $p_i(k+1) = f(p_i(k) + \Delta_i(k))$, where
$\Delta_i(k)$ is a stationary zero-mean white noise sequence for every $i$, and $E[\Delta_i(k)\Delta_j(k)^T] = 0$ unless $v_i = v_j$. The function
$f(\cdot) : \mathbb{R}^3 \to \mathbb{S}^2$ is a projection function onto the unit sphere. In addition, suppose $(v_i, v_j) \in \mathcal{E}(k)$ if and only if the
geodesic distance between them is less than or equal to some predetermined range. In this case, prediction of $\mathcal{G}(k+1)$
given $\mathcal{G}(k)$ cannot be improved by the knowledge of the graphs observed prior to $k$: $\mathcal{G}(k-1), \dots, \mathcal{G}(0)$, and hence the
change in the graph sequence satisfies the Markovian property. If no restriction is placed on the motion of the nodes
or edge formation, the number of graphs in the set $\mathbb{G}$ is the total number of distinct graphs possible with $n$ nodes. In
that case, $N = 2^{\frac{1}{2}n(n-1)}$, where $N := |\mathbb{G}|$. If certain nodes are restricted to move only within certain geographic areas,
$N$ is less than this maximum number.

We assume that the Markov chain that governs the evolution of the graphs $\{\mathcal{G}(k)\}_{k=0}^{\infty}$ is homogeneous, and denote
the transition probability matrix of the chain by $\mathcal{P}$. The following result states under what conditions the node states
converge in the mean square sense and when the limiting mean values are positive. The proof of the result is provided in
Section 3.1. In the statement of the theorem, the state vector is the vector of all the node states: $x(k) := [x_1(k), \dots, x_n(k)]^T$,
and the union graph $\hat{\mathcal{G}} := \cup_{i=1}^{N} \mathcal{G}_i$ is the graph obtained by taking the union of all graphs in the set $\mathbb{G}$, i.e., $\hat{\mathcal{G}} = (\mathcal{V}, \cup_{i=1}^{N} \mathcal{E}_i)$.

Theorem 2. When the temporal evolution of the graph $\mathcal{G}(k)$ is governed by a Markov chain that is ergodic, the state
vector $x(k)$ converges in the mean square sense. More precisely, for every initial condition $x(0)$, there exists a vector
$\mu \in \mathbb{R}^n$ and a symmetric positive semi-definite matrix $Q$ so that $E[x(k)] \to \mu$ and $E[x(k)x(k)^T] \to Q$. Moreover, the
vector $\mu$ is entry-wise non-negative. If $\mathcal{P}$ is entry-wise positive, we have $\mu_i > 0$ if and only if there is a path from
node $v_i$ to the source node $v_1$ (with index 1) in the union graph $\hat{\mathcal{G}}$.

Figure 3: An example of two graphs $\mathcal{G}_1$ (panel a) and $\mathcal{G}_2$ (panel b) that appear at different times due to the motion of 4 mobile nodes. If these two graphs form the set
of all possible graphs that can ever appear, then $\mathbb{G} = \{\mathcal{G}_1, \mathcal{G}_2\}$. Note that neither $\mathcal{G}_1$ nor $\mathcal{G}_2$ is connected, but the union graph $\cup_i \mathcal{G}_i$ is.

Theorems 2 and 1 together explain the rationale behind Phase II of the DSSD algorithm. The state of a node
converges to 0 if and only if it is permanently disconnected from the source. If it is connected to the source in the
union graph, meaning it is connected in a time-average sense (even if it is not connected in every time instant), then
the expected value of its state converges to a positive number. As a result, a node can detect if it is connected to the
source or not simply by checking if its state has converged to 0, which is done by comparing the state to the threshold $\epsilon$.

A closed-form expression for the limiting mean of the node states, i.e., the vector $\mu$, and its correlation matrix $Q$,
is also provided in Lemma 1 in Section 3.1. We refrain from stating them here as the expressions require a significant
amount of terminology to be introduced. The ergodic property of the Markov chain assumed in Theorem 2 ensures
that the chain has a steady state distribution which is an entry-wise positive vector. Intuitively, ergodicity of the chain
means that every graph in the set $\mathbb{G}$ appears infinitely often as time progresses. In other words, the network $\mathcal{G}(k)$ does
not get stuck in one of the graphs in the set $\mathbb{G}$. As a result, if a node is connected to the source in the union graph, even
if it is not connected to the source in any one of the graphs that can ever appear, there is still a path for information
flow between the source node and this node over time. Figure 3 shows a simple example in which the network $\mathcal{G}(k)$
can be only one of the two graphs shown in the figure. Node 4 is disconnected from the source at every $k$, but it is
connected in a time-average sense since information from the source node 1 can flow to 2 in one time instant when
the graph $\mathcal{G}_1$ occurs and then from 2 to 4 in another time instant when $\mathcal{G}_2$ occurs. In such a situation the theorem
guarantees that the expected value of the state of node 4 will converge to a positive number. Thus, node 4 can detect
that it is connected to the source in a time-average sense. On the other hand, if a node is not connected to the source in
the union graph there is no path for information flow between itself and the source over all time. This is equivalent to
the node being permanently disconnected from the source, so the result that the mean of the node's state converges to
0 is consistent with the result of Theorem 1. The condition that $\mathcal{P}$ is entry-wise positive means that there is a non-zero
probability that the network at time $k$ can transition into any one of the possible graphs at time $k+1$, even though most
of these probabilities may be small. We believe this sufficient condition is merely an artifact of our proof technique,
and in fact, this is not a necessary condition for convergence to occur.
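A quick simulation in the spirit of the Figure 3 example illustrates connectivity in a time-average sense. The edge sets below are assumptions chosen to match the description in the text (the exact graphs in the figure are not reproduced here): $\mathcal{G}_1$ has edges (1,2) and (3,4), $\mathcal{G}_2$ has only edge (2,4), and each graph is drawn independently with probability 1/2 at every step, which corresponds to an entry-wise positive transition matrix. Indices are 0-based in the code, so node 4 of the text is index 3.

```python
import random


def step(x, neighbors, s, source=0):
    """One synchronous DSSD state update."""
    return [
        (sum(x[j] for j in nbrs) + (s if i == source else 0.0)) / (len(nbrs) + 1)
        for i, nbrs in enumerate(neighbors)
    ]


# Hypothetical graphs; neither one connects node 4 (index 3) to the source.
G1 = [{1}, {0}, {3}, {2}]      # edges (1,2) and (3,4) in 1-based labels
G2 = [set(), {3}, set(), {1}]  # edge (2,4) in 1-based labels

rng = random.Random(0)         # fixed seed for repeatability
s = 100.0
x = [0.0] * 4
history = []
for _ in range(5000):
    graph = G1 if rng.random() < 0.5 else G2
    x = step(x, graph, s)
    history.append(x[3])

# Time average of node 4's state after discarding an initial transient.
tail = history[1000:]
mean_x4 = sum(tail) / len(tail)
```

Although node 4 never has a path to the source in any single graph, its state stays non-negative and its time average is positive.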

Remark 1. An advantage of the DSSD algorithm is that its convergence rate for a time-invariant network is independent
of the number of agents in the network (Lemma 1 in [8]). This makes the algorithm scalable to large networks.
The convergence rate of the mean and variance of the node states to their limiting values in the mobile case, however,
requires further research.

3.1. Proof of Theorem 2

We start with some notation. Let $D(k)$ be the degree matrix of $\mathcal{G}(k)$, i.e., $D(k) := \mathrm{diag}(d_1(k), \dots, d_n(k))$. If node $i$
fails at $k_0$, we assign $d_i(k) = 0$ and $\mathcal{N}_i(k) = \emptyset$ (the empty set) for $k > k_0$. Let $A(k)$ be the adjacency matrix of the graph
$\mathcal{G}(k)$, i.e., $A_{ij}(k) = 1$ if $(i, j) \in \mathcal{E}(k)$, and 0 otherwise. With these matrices, (1) can be compactly written as:

$$x(k+1) = (D(k) + I)^{-1}\big(A(k)x(k) + s\,e_1\big) \tag{3}$$

where $e_1 = [1, 0, \dots, 0]^T$. Recall that the source node has been indexed as node $v_1$. The above can be written as

$$x(k+1) = J(k)x(k) + B(k)w(k) \tag{4}$$

where the matrices $J$, $B$ and the vector $w$ are defined as

$$J(k) := (D(k) + I)^{-1}A(k), \qquad B(k) := (D(k) + I)^{-1}, \qquad w(k) := s\,e_1. \tag{5}$$

Under the assumption that the temporal evolution of the graphs $\mathcal{G}(k)$ is governed by a Markov chain, we can write (3)
in the standard notation of a jump-linear system:

$$x(k+1) = J_{\mathcal{G}(k)}x(k) + B_{\mathcal{G}(k)}w(k) \tag{6}$$

where $J_{\mathcal{G}(k)} = J(k)$, $B_{\mathcal{G}(k)} = B(k)$, and $w(k) = w = s\,e_1$. This notation is used to emphasize that the state and input
matrices $J$ and $B$ of the linear system (6) change randomly, and the transition is governed by a Markov chain.

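Let $\mu(k) := E[x(k)]$ and $Q(k) := E[x(k)x(k)^T]$ be the mean and correlation of the state vector $x(k)$, respectively. We
need the following definitions and terminology to provide expressions for these quantities as well as to state
conditions for the convergence of the mean and correlation. Let $\mathbb{R}^{m \times n}$ be the space of $m \times n$ real matrices. Let $\mathbb{H}^{m \times n}$ be
the set of all $N$-sequences of real $m \times n$ matrices $Y_i \in \mathbb{R}^{m \times n}$. That is, if $Y \in \mathbb{H}^{m \times n}$ then $Y = (Y_1, Y_2, \dots, Y_N)$, where each
$Y_i$ is an $m \times n$ matrix. The operators $\varphi$ and $\hat{\varphi}$ are defined as follows: let $(y_i)_j \in \mathbb{R}^m$ be the $j$-th column of $Y_i \in \mathbb{R}^{m \times n}$,
then

$$\varphi(Y_i) := \begin{bmatrix} (y_i)_1 \\ \vdots \\ (y_i)_n \end{bmatrix} \quad \text{and} \quad \hat{\varphi}(Y) := \begin{bmatrix} \varphi(Y_1) \\ \vdots \\ \varphi(Y_N) \end{bmatrix}. \tag{7}$$

Hence, $\varphi(Y_i) \in \mathbb{R}^{mn}$ and $\hat{\varphi}(Y) \in \mathbb{R}^{Nmn}$. Similarly, define an inverse function $\hat{\varphi}^{-1} : \mathbb{R}^{Nmn} \to \mathbb{H}^{m \times n}$ that produces an
element of $\mathbb{H}^{m \times n}$ given a vector in $\mathbb{R}^{Nmn}$. For $X_i \in \mathbb{R}^{n \times n}$, $i = 1, \dots, N$, define

$$\mathrm{diag}[X_i] := \begin{bmatrix} X_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & X_N \end{bmatrix} \in \mathbb{R}^{Nn \times Nn}. \tag{8}$$

For a set of square matrices $C_{ij} \in \mathbb{R}^{m \times m}$, $i, j = 1, \dots, N$, we also use the notation $C = [C_{ij}]$ to denote the following
matrix:

$$C = [C_{ij}] := \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1N} \\ C_{21} & C_{22} & \cdots & C_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ C_{N1} & C_{N2} & \cdots & C_{NN} \end{bmatrix}.$$
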
In the context of the jump linear system (6), define the matrices

$$C := (\mathcal{P}^T \otimes I_n)\,\mathrm{diag}[J_i] \in \mathbb{R}^{Nn \times Nn}, \qquad D := (\mathcal{P}^T \otimes I_{n^2})\,\mathrm{diag}[J_i \otimes J_i] \in \mathbb{R}^{Nn^2 \times Nn^2}, \tag{9}$$

where $I_n$ is the $n \times n$ identity matrix, $\mathcal{P}$ is the transition probability matrix of the Markov chain and $\otimes$ denotes the Kronecker
product. Note that the matrices $C$ and $D$ can be expressed, using the notation introduced above, as

$$C = [p_{ji}J_j], \qquad D = [p_{ji}F_j] \quad \text{where } F_i := J_i \otimes J_i, \tag{10}$$

where $p_{ij}$ is the $(i, j)$-th entry of $\mathcal{P}$ and $\pi \in \mathbb{R}^{N}$ is the stationary distribution of the Markov chain, which exists due to the
ergodicity that follows directly from the assumption of positive $\mathcal{P}$.
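The construction in (9) and (10) can be made concrete with a short numerical sketch. It assumes NumPy is available; the two 4-node graphs and the 2-state transition matrix `P` are hypothetical values chosen only for illustration, and the helper names are not from the original text.

```python
import numpy as np


def jump_matrix(adj):
    """Return J = (D + I)^{-1} A for an adjacency matrix A."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    return adj / (deg + 1.0)[:, None]


def block_diag(blocks):
    """Place equally sized square blocks on the diagonal of a zero matrix."""
    size = blocks[0].shape[0]
    out = np.zeros((size * len(blocks), size * len(blocks)))
    for i, b in enumerate(blocks):
        out[i * size:(i + 1) * size, i * size:(i + 1) * size] = b
    return out


def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))


# Hypothetical graphs on n = 4 nodes and a hypothetical transition matrix.
A1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
A2 = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 1, 0, 0]]
P = np.array([[0.7, 0.3], [0.4, 0.6]])
n = 4
J1, J2 = jump_matrix(A1), jump_matrix(A2)

# Equation (9): C and D built from Kronecker products.
C = np.kron(P.T, np.eye(n)) @ block_diag([J1, J2])
D = np.kron(P.T, np.eye(n * n)) @ block_diag([np.kron(J1, J1), np.kron(J2, J2)])

rho_C = spectral_radius(C)
rho_D = spectral_radius(D)
```

Block $(i, j)$ of the resulting `C` equals $p_{ji} J_j$, in agreement with (10), and both spectral radii come out below 1 for this example.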

For a matrix $X$, we write $X \ge 0$ to mean that $X$ is entry-wise non-negative and write $X > 0$ to mean $X \ge 0$ and
$X \ne 0$. If every entry of $X$ is positive, we write $X \gg 0$. For a matrix $X$, $X \succeq (\succ)\, 0$ means it is positive semi-definite
(definite). For two matrices $X$ and $Y$ of compatible dimension, we write $X \ge Y$ if $X_{ij} \ge Y_{ij}$ for all $i, j$, and write $X > Y$
if $X \ge Y$ and $X \ne Y$. For a vector $x$, we write $x \ge 0$ to mean $x$ is entry-wise non-negative, $x > 0$ to mean $x$ is entry-wise
non-negative and at least one entry is positive, and $x \gg 0$ to mean every entry of $x$ is positive. The fact that both $J$
and $B$ are entry-wise non-negative will be useful later.

We will also need the following technical results to prove Theorem 2. Here $\rho(\cdot)$ denotes the spectral radius.

Proposition 1 ([14], Theorem 3.2). Let $C = [C_{ij}] \in \mathbb{R}^{mn \times mn}$ be a block $m \times m$ matrix, where $C_{ij}$ are non-negative
$n \times n$ matrices for all $i, j = 1, \dots, m$, and let $\tilde{C} = [\|C_{ij}\|] \in \mathbb{R}^{m \times m}$, where $\|\cdot\|$ is either the induced 1-norm ($\|\cdot\|_1$) or the
induced $\infty$-norm ($\|\cdot\|_\infty$). Then $\rho(C) \le \rho(\tilde{C})$.

Proposition 2. Let $\mathcal{G}$ be an undirected graph with $n$ nodes, and let $J = (D + I)^{-1}A$, where $D$ and $A$ are the degree and
adjacency matrices, respectively. Then $\|J\|_\infty < 1$ and $\|J \otimes J\|_\infty < 1$.
Proof. Based on examining the structure of $J$, we see that $J_{ij} = \frac{1}{d_i + 1}\mathbb{1}_{\mathcal{N}_i}(j)$, where $\mathbb{1}_A(x)$ is 1 if $x \in A$ and 0
otherwise. Obviously, $J$ is a non-negative matrix and so is $J \otimes J$. Meanwhile, each row sum of $J$ is at most $(n-1)/n$,
where $n$ is the number of nodes. Therefore $\|J\|_\infty \le \frac{n-1}{n} < 1$, which leads to $\|J \otimes J\|_\infty = \|J\|_\infty \|J\|_\infty < 1$, where the
equality is a result of properties of the Kronecker product. □
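The row-sum bound in Proposition 2 can be checked directly on a small case. The sketch below uses a hypothetical complete graph on 4 nodes, for which every row sum of $J$ equals $(n-1)/n$; the helper names are illustrative.

```python
def jump_matrix(neighbors):
    """J = (D + I)^{-1} A, built from neighbor sets."""
    n = len(neighbors)
    return [
        [1.0 / (len(neighbors[i]) + 1) if j in neighbors[i] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def inf_norm(M):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


n = 4
complete = [set(range(n)) - {i} for i in range(n)]  # hypothetical complete graph
J = jump_matrix(complete)
norm_J = inf_norm(J)
norm_JJ = inf_norm(kron(J, J))
```

Here `norm_J` is 3/4 and `norm_JJ` is its square, so both norms are strictly below 1.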

Proposition 3. // the temporal evolution of the graph Q(k) is governed by a Markov chain that is ergodic, then we 
have p(C) < 1 and p{D) < 1, where C, D are defined in (|9]l. 

Proof. Since $\rho(C) = \rho([p_{ji}J_j])$ (see (10)), we obtain by applying Proposition 1 that

$$\rho(C) \leq \rho([\|p_{ji}J_j\|_\infty]) = \rho([p_{ji}\|J_j\|_\infty]),$$

where the equality follows from the $p_{ij}$'s being probabilities and therefore non-negative. Since $\|J_j\|_\infty < 1$ (Proposition 2), it follows that $P^T > [p_{ji}\|J_j\|_\infty]$. Since both $P^T$ and $[p_{ji}\|J_j\|_\infty]$ are non-negative, and $P^T$ is irreducible (which follows from the ergodic assumption of the Markov chain), it follows from Corollary 1.5 of [16, pg. 27] that $\rho([p_{ji}\|J_j\|_\infty]) < \rho(P^T) = \rho(P) = 1$, the last equality being a property of a transition probability matrix. This proves that $\rho(C) < 1$.

To show that $\rho(D) < 1$, since $\rho(D) = \rho([p_{ji}F_j])$ (see (10)), we obtain by applying Proposition 1 that

$$\rho(D) \leq \rho([\|p_{ji}F_j\|_\infty]) = \rho([p_{ji}\|F_j\|_\infty]),$$

where the equality follows from the $p_{ij}$'s being probabilities and therefore non-negative. Since the scalars $\|F_j\|_\infty$ satisfy $\|F_j\|_\infty < 1$ for each $j$ (see Proposition 2), it follows that $P^T > [p_{ji}\|F_j\|_\infty]$. Since both $P^T$ and $[p_{ji}\|F_j\|_\infty]$ are non-negative, and $P^T$ is irreducible (which follows from the ergodic assumption of the Markov chain), it follows from Corollary 1.5 of [16, pg. 27] that $\rho([p_{ji}\|F_j\|_\infty]) < \rho(P^T) = \rho(P) = 1$, the last equality being a property of a transition probability matrix. This proves that $\rho(D) < 1$. □
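
The chain of inequalities in this proof can be illustrated numerically. The sketch below assembles $C = [p_{ji}J_j]$ and the matrix of block norms $[p_{ji}\|J_j\|_\infty]$ for two small graphs and bounds their spectral radii from above. It relies on the standard fact that, for a non-negative matrix $A$ and any entry-wise positive vector $x$, $\rho(A) \leq \max_i (Ax)_i / x_i$. The transition matrix, the two graphs, and all function names are assumptions chosen for illustration only.

```python
def averaging_matrix(n, edges):
    """J = (D + I)^{-1} A for an undirected graph on nodes 0..n-1."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return [[1.0 / (len(nbrs[i]) + 1) if j in nbrs[i] else 0.0
             for j in range(n)] for i in range(n)]

def inf_norm(M):
    """Induced infinity-norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

def assemble(P, blocks):
    """Block matrix whose (i, j) block is p_ji * blocks[j]."""
    N, n = len(blocks), len(blocks[0])
    return [[P[j][i] * blocks[j][r][c] for j in range(N) for c in range(n)]
            for i in range(N) for r in range(n)]

def rho_upper_bound(A, iters=200):
    """Upper bound on rho(A) for a non-negative matrix A.

    Every positive x gives rho(A) <= max_i (A x)_i / x_i. Iterating with
    A + I keeps x strictly positive and tightens the bound over time.
    """
    m = len(A)
    x = [1.0] * m
    best = float("inf")
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(m)) for i in range(m)]
        best = min(best, max(Ax[i] / x[i] for i in range(m)))
        y = [Ax[i] + x[i] for i in range(m)]
        total = sum(y)
        x = [value / total for value in y]
    return best

# Hypothetical two-state chain with all transition probabilities positive.
P = [[0.9, 0.1], [0.3, 0.7]]
J = [averaging_matrix(3, [(0, 1), (1, 2)]), averaging_matrix(3, [(0, 2)])]
C = assemble(P, J)
C_tilde = assemble(P, [[[inf_norm(Jj)]] for Jj in J])  # 1 x 1 blocks
rho_C = rho_upper_bound(C)
rho_C_tilde = rho_upper_bound(C_tilde)
print(rho_C, rho_C_tilde)
```

Both printed values are upper bounds, so values below 1 are consistent with $\rho(C) < 1$ for this particular example; the sketch is not a substitute for the proof.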

The proof of Theorem 2 will require the following result.

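Lemma 1. Consider the jump linear system (6) with an underlying Markov chain that is ergodic. If $\rho(D) < 1$, where $D$ is defined in (9), then the state $x(k)$ of the system (6) converges in the mean square sense, i.e., $\mu(k) \to \mu$ and $Q(k) \to Q$, where $\mu$ and $Q$ are given by

$$\mu := \sum_{i=1}^{N} q^{(i)}, \qquad Q := \sum_{i=1}^{N} Q^{(i)}, \tag{11}$$

where $q := [q^{(1)T}, \dots, q^{(N)T}]^T = (I - C)^{-1}\psi \in \mathbb{R}^{Nn}$, the matrices $Q^{(i)}$ are obtained from $R(q)$ as in [13, Proposition 3.37], and

$$\psi := [\psi_1^T, \dots, \psi_N^T]^T \in \mathbb{R}^{Nn} \quad \text{and} \quad \psi_j := \sum_{i=1}^{N} p_{ij} B_i w \pi_i \in \mathbb{R}^{n},$$

$$R(q) := (R_1(q), \dots, R_N(q)) \in \mathbb{H}^{n \times n} \quad \text{and} \quad R_j(q) := \sum_{i=1}^{N} p_{ij}\left(B_i w w^T B_i^T \pi_i + J_i q^{(i)} w^T B_i^T + B_i w q^{(i)T} J_i^T\right) \in \mathbb{R}^{n \times n},$$

and $C$ is defined in (9), $\pi_i$ is the $i$-th entry of the steady state distribution of the Markov chain, and $J_i$, $B_i$ are the system matrices in (6). Moreover, $\mu \geq 0$.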

Proof. The first statement about mean square convergence follows from standard results in jump linear systems, as do 
the expressions for the mean and correlation; see [13, Proposition 3.37]. Note that the existence of the steady state 
distribution $\pi$ follows from the ergodicity of the Markov chain.

To show that $\mu$ is entry-wise non-negative, note that since $\rho(C) < 1$ (Proposition 3), we have $M := (I - C)^{-1} = \sum_{k=0}^{\infty} C^k$. Thus, $M \geq 0$ since $C$ is non-negative (which follows from the fact that $P \geq 0$ and $J_i \geq 0$ for every $i$). It follows from the expression for $\psi$ that it is also a non-negative vector. This shows that $q = (I - C)^{-1}\psi \geq 0$, which implies $\mu \geq 0$. □

Now we are ready to prove Theorem 2.

Proof of Theorem 2. It follows from Proposition 3 that under the hypothesis of the Markov chain being ergodic, we have $\rho(D) < 1$. It then follows from Lemma 1 that the state converges in the mean square sense, which proves the first statement of the theorem. Note that the limiting mean and correlation of the state are also provided by Lemma 1.

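We already know from Lemma 1 that $\mu \geq 0$. To prove the last statement of the theorem, that $\mu(u) > 0$ if and only if there is a path between node $u$ and the source node 1 in the union graph, we have to look at the structure of the tall vector $q$ in (11) more carefully, since $q$ completely determines $\mu$. With some abuse of notation, from now on the source node will be referred to as node 1 instead of $v_1$. Note that $\pi \gg 0$, which follows from ergodicity, $P \gg 0$ by assumption, $B_i$ is a diagonal matrix with positive diagonal entries for every $i$ (follows from its definition), and $w = s e_1$, where $s > 0$ and $e_1 = [1, 0, \dots, 0]^T \in \mathbb{R}^n$. It is easy to show now that $\psi_j = a_j e_1$ for some $a_j > 0$. Thus, $\psi = [a_1 e_1^T, \dots, a_N e_1^T]^T \in \mathbb{R}^{Nn}$. Since $\rho(C) < 1$ (Proposition 3), $M := (I - C)^{-1} = \sum_{k=0}^{\infty} C^k$. Now, we express the matrix $M$ in terms of its blocks: $M = [M^{(ij)}]$, where $M^{(ij)}$ are $n \times n$ matrices. Then, $q$ can be rewritten as

$$
q = \begin{bmatrix} M^{(11)} & M^{(12)} & \cdots & M^{(1N)} \\ M^{(21)} & M^{(22)} & \cdots & M^{(2N)} \\ \vdots & \vdots & \ddots & \vdots \\ M^{(N1)} & M^{(N2)} & \cdots & M^{(NN)} \end{bmatrix}
\begin{bmatrix} a_1 e_1 \\ a_2 e_1 \\ \vdots \\ a_N e_1 \end{bmatrix}
= \begin{bmatrix} \sum_{j=1}^{N} a_j M^{(1j)} e_1 \\ \vdots \\ \sum_{j=1}^{N} a_j M^{(Nj)} e_1 \end{bmatrix}.
$$

Therefore, $q^{(i)} = \sum_{j=1}^{N} a_j M^{(ij)} e_1 = \sum_{j=1}^{N} a_j M^{(ij)}_{:1}$, where the subscript $:1$ denotes the first column of the corresponding matrix. Hence, the $u$-th entry of $q^{(i)}$ is $q^{(i)}(u) = \sum_{j=1}^{N} a_j M^{(ij)}_{u1}$. Recall that $\mu = \sum_{i=1}^{N} q^{(i)}$. Since every $a_j$ is positive and $M \geq 0$, we have $\mu(u) = 0$ if and only if $q^{(i)}(u) = 0$ for $i = 1, \dots, N$, which is also equivalent to $\sum_{i=1}^{N}\sum_{j=1}^{N} M^{(ij)}_{u1} = 0$.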
The subsequent discussion requires introducing directed graphs associated with matrices. For every $\ell \times \ell$ matrix $A$, define $\mathcal{G}(A) = (\mathcal{V}, \mathcal{E})$ to be the directed graph corresponding to $A$ as follows: the node set $\mathcal{V}$ is the index set $\{1, \dots, \ell\}$ and the edge set is defined by $(i, j) \in \mathcal{E}$ if and only if $A_{ij} \neq 0$ [17]. It is a standard result in graph theory that the number of walks of length $r$ from a vertex $i$ to a vertex $j$ in a directed graph is the $(i, j)$-th element of $A^r$, where $A$ is the adjacency matrix of the graph [18, pp. 165]. Since $M = \sum_{k=0}^{\infty} C^k$, it follows from the preceding discussion that the $(i, j)$-th entry of $M$ is positive if and only if there exists a path from the vertex $i$ to vertex $j$ in the directed graph $\mathcal{G}(C)$. Note that the graph $\mathcal{G}(C)$ contains $Nn$ nodes. We can group the $Nn$ nodes into $N$ clusters such that each cluster, containing $n$ nodes, can be thought of as a copy of the $n$ nodes in the sensor and robot network. To prevent confusion between the vertices in $\mathcal{G}(C)$ and the node set of the original network, we use $v^{(i)}$ to denote a node in the graph $\mathcal{G}(C)$ that is the $i$-th copy of the node $v$ in $\mathcal{V}$, where $i = 1, \dots, N$.
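
The walk-counting fact used above can be checked on a toy example. The sketch below computes powers of a 0/1 adjacency matrix and sums them to decide reachability; the 5-vertex directed graph and the helper names are hypothetical and serve only to illustrate the argument.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]

def mat_power(A, r):
    """A raised to the non-negative integer power r."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(r):
        result = mat_mul(result, A)
    return result

# Hypothetical directed graph: 0->1, 0->2, 1->3, 2->3, 3->0, 4->0.
A = [[0, 1, 1, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 1, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0]]

# Entry (0, 3) of A^2 counts the walks of length 2 from vertex 0 to vertex 3.
walks_2 = mat_power(A, 2)

# Summing A^0 .. A^(size-1) gives a matrix whose (i, j) entry is positive
# exactly when vertex j can be reached from vertex i.
size = len(A)
reach = [[0] * size for _ in range(size)]
for r in range(size):
    power = mat_power(A, r)
    reach = [[reach[i][j] + power[i][j] for j in range(size)]
             for i in range(size)]
print(walks_2[0][3], reach[4][3], reach[0][4])
```

Vertex 4 has an outgoing edge but no incoming one, so it reaches vertex 3 while nothing reaches it. The same positivity test, applied to $M = \sum_k C^k$, is what links the entries of $M$ to paths in $\mathcal{G}(C)$.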

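Therefore $\sum_{i=1}^{N}\sum_{j=1}^{N} M^{(ij)}_{u1} = 0$ is equivalent to there being no directed path from any of $u$'s copies ($u^{(i)}$, $i = 1, \dots, N$) to any of 1's copies ($1^{(i)}$, $i = 1, \dots, N$) in the directed graph $\mathcal{G}(C)$. Otherwise, $\mu(u) > 0$. Since existence of an edge from $i$ to $j$ in $\mathcal{G}(A)$ only depends on whether the $(i, j)$-th entry of $A$ is non-zero, and does not depend on the specific value of the entry, it is convenient to define $\bar{A}$ to be a matrix associated with the matrix $A$, such that $\bar{A}_{ij} = 1$ if $A_{ij} \neq 0$ and $\bar{A}_{ij} = 0$ if $A_{ij} = 0$. Since $P \gg 0$, we have

$$
\bar{C} = \begin{bmatrix} \bar{J}_1 & \bar{J}_2 & \cdots & \bar{J}_N \\ \bar{J}_1 & \bar{J}_2 & \cdots & \bar{J}_N \\ \vdots & \vdots & & \vdots \\ \bar{J}_1 & \bar{J}_2 & \cdots & \bar{J}_N \end{bmatrix}.
$$

It can be seen in a straightforward manner upon examining the matrix $\bar{C}$ that if there is an edge between nodes $u$ and $v$ in the $i$-th graph, i.e., $(u, v) \in \mathcal{E}^{(i)}$, then $(u^{(j)}, v^{(i)}) \in \mathcal{E}(C)$ for all $j = 1, \dots, N$, and $(v^{(j)}, u^{(i)}) \in \mathcal{E}(C)$ for all $j = 1, \dots, N$, i.e., there are edges in $\mathcal{G}(C)$ from all copies of $u$ to $v^{(i)}$, the $i$-th copy of $v$, and from all copies of $v$ to $u^{(i)}$, the $i$-th copy of $u$.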

Now we will show that if an arbitrary node $u$ is connected to 1 in the union graph $\bigcup_{i=1}^{N} \mathcal{G}^{(i)}$, then there is a path from a copy of $u$ to a copy of 1 in the directed graph $\mathcal{G}(C)$, otherwise not. To see that this is the case, we first take an example: consider a path of length 2 from $u$ to 1 in the union graph that involves two edges in two distinct graphs: $(u, v) \in \mathcal{E}^{(2)}$ and $(v, 1) \in \mathcal{E}^{(1)}$. From the preceding discussion, we have that $(v, 1) \in \mathcal{E}^{(1)} \Rightarrow (v^{(1)}, 1^{(1)}), (v^{(2)}, 1^{(1)}) \in \mathcal{E}(C)$, and $(u, v) \in \mathcal{E}^{(2)} \Rightarrow (u^{(1)}, v^{(2)}), (u^{(2)}, v^{(2)}) \in \mathcal{E}(C)$. Thus a path from a copy of $u$ to a copy of 1 in $\mathcal{G}(C)$ is $\{(u^{(1)}, v^{(2)}), (v^{(2)}, 1^{(1)})\}$. This argument works as long as there is a path from $u$ to 1 in the union graph, irrespective of how long the path is. This shows that if $u$ is connected to 1 in the union graph, then there is a path from at least one of its copies to one of 1's copies in the directed graph $\mathcal{G}(C)$, which means $\mu(u) > 0$. If, however, $u$ is not connected to 1 in the union graph, we can show that there is no path from any of $u$'s copies to any of 1's copies. This can be shown by considering the set of all nodes that do not have paths to 1 in the union graph and the set of nodes that do separately; see [19] for details. This concludes the proof of the last statement of the theorem. □
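
The last statement of the theorem can be illustrated on a toy example. The sketch below evaluates $\mu = \sum_i q^{(i)}$ by iterating $q \leftarrow Cq + \psi$, which converges to $(I - C)^{-1}\psi$ because $\rho(C) < 1$, and compares the support of $\mu$ with the set of nodes connected to the source in the union graph. It assumes $J_i = (D_i + I)^{-1}A_i$, $B_i = (D_i + I)^{-1}$ and $w = s e_1$, with node 0 in the code playing the role of the source node 1. The two graphs, the transition matrix, and all function names are hypothetical.

```python
def neighbor_sets(n, edges):
    """Neighbor set of every node in an undirected graph on nodes 0..n-1."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs

def expected_steady_state(n, graphs, P, pi, s, iters=2000):
    """Return mu = sum_i q^(i), where q is the fixed point of q = C q + psi."""
    N = len(graphs)
    nbrs = [neighbor_sets(n, g) for g in graphs]
    # psi_j = sum_i p_ij * B_i * w * pi_i; only the source entry is non-zero.
    psi = [[0.0] * n for _ in range(N)]
    for j in range(N):
        psi[j][0] = sum(P[i][j] * pi[i] * s / (len(nbrs[i][0]) + 1)
                        for i in range(N))
    q = [[0.0] * n for _ in range(N)]
    for _ in range(iters):
        # J_i q^(i): node u sums its neighbors' entries in graph i and
        # divides by (degree + 1).
        Jq = [[sum(q[i][v] for v in nbrs[i][u]) / (len(nbrs[i][u]) + 1)
               for u in range(n)] for i in range(N)]
        # Block j of C q is sum_i p_ij * J_i q^(i).
        q = [[psi[j][u] + sum(P[i][j] * Jq[i][u] for i in range(N))
              for u in range(n)] for j in range(N)]
    return [sum(q[i][u] for i in range(N)) for u in range(n)]

def connected_to_source(n, graphs):
    """Nodes with a path to node 0 in the union of the graphs."""
    nbrs = neighbor_sets(n, [edge for g in graphs for edge in g])
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in nbrs[u] - seen:
            seen.add(v)
            stack.append(v)
    return seen

# Two hypothetical graphs on 4 nodes; node 3 is isolated in both.
n = 4
graphs = [[(0, 1)], [(1, 2)]]
P = [[0.9, 0.1], [0.1, 0.9]]   # symmetric, so pi = (0.5, 0.5) is stationary
mu = expected_steady_state(n, graphs, P, pi=[0.5, 0.5], s=100.0)
positive = {u for u in range(n) if mu[u] > 0}
print(positive, connected_to_source(n, graphs))
```

Node 2 is never adjacent to the source in any single graph, yet its limiting mean is positive because it is connected to the source in the union graph; node 3, isolated in both graphs, keeps a mean of exactly zero.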
[Figure 4 panels: (a) $\mathcal{G}$ before cut; (b) $\mathcal{G}(k)$ for $k > 100$; (c) $x_u(k)$ and $x_v(k)$ vs. $k$]

Figure 4: (a)-(b): A sensor network with 200 static nodes, shown before and after a cut occurs due to the failure of the nodes shown as red squares. The cut occurs at k = 100. (c): The states of two nodes u and v as a function of iteration number. The source node
is at the center (triangle), and the source strength is chosen as i = 5 x 10*. 

4. Simulation Tests 

The DSSD algorithm was tested in a MATLAB™ simulation for a network consisting of 200 agents initially 
deployed in a unit square at random. Two agents can only establish direct communication if their Euclidean distance is 
less than 0.11. The source strength and cut detection threshold was = 5 x 10^ and e - 10"^, respectively. Since there 
is no existing prior work on the problem of detecting separation in mobile networks that can operate without multi-hop 
routing, we do not provide simulation comparison with existing algorithms. Note that the solutions proposed in 1^ 
require routing between the nodes and the base station, which is challenging in sensor and robotic networks in which 
the topology can change with time quickly. 

4.1. Performance of DSSD in a static network 

The first set of simulations is conducted with 200 static nodes (see Figure 4(a)). The center node (symbolized by a triangle) is the source node. Simulations are run in a synchronous manner, and a neighbor is removed from the list of neighbors of a node the first time the node fails to receive messages from that neighbor. At $k = 100$ the nodes shown as red squares in Figure 4(b) fail, leading to a cut in the network. Figure 4(c) shows the time evolution of the states (calculated using (1)) of the four nodes $u$, $v$, $w$, and $z$. Node $v$ is the only one among the four that is separated from the source after the cut occurs. Initially, the states of every node increase from 0 and then settle down to their steady state value. After the cut occurs, the state of node $v$ decreases towards 0. When the state of node $v$ decreases below the preset threshold $\epsilon$, it declares itself cut from the source. This occurs at $k = 133$, thus the delay between the occurrence of the cut and its detection by $v$ is 33 time-steps.
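
The detection mechanism described above can be sketched on a much smaller network. The code below assumes that the state update in (1) has the form $x_u(k+1) = \big(\sum_{v \in \mathcal{N}_u(k)} x_v(k) + s\,\mathbf{1}_{\{u = \text{source}\}}\big) / (d_u(k) + 1)$, which is consistent with $J = (D + I)^{-1}A$. The 6-node line network, the function names, and the rule of checking the threshold only from the cut time onward are illustrative assumptions; the values $s = 100$ and $\epsilon = 0.01$ are borrowed from the hardware experiment of Section 5.

```python
def dssd_step(x, nbrs, source, s):
    """One synchronous update of every node's scalar state."""
    return {u: (sum(x[v] for v in nbrs[u]) + (s if u == source else 0.0))
               / (len(nbrs[u]) + 1)
            for u in nbrs}

def simulate(edges, source, s, eps, failed, k_cut, k_end):
    """Run the update, fail the given nodes at k_cut, and record detections."""
    nodes = {u for edge in edges for u in edge}

    def neighbor_map(alive):
        nbrs = {u: set() for u in alive}
        for u, v in edges:
            if u in alive and v in alive:
                nbrs[u].add(v)
                nbrs[v].add(u)
        return nbrs

    nbrs = neighbor_map(nodes)
    x = {u: 0.0 for u in nodes}
    detected = {}
    for k in range(k_end):
        if k == k_cut:
            alive = nodes - failed          # failed nodes stop participating
            nbrs = neighbor_map(alive)
            x = {u: x[u] for u in alive}
        x = dssd_step(x, nbrs, source, s)   # x now holds the states at k + 1
        if k >= k_cut:
            for u in x:
                if x[u] < eps and u not in detected:
                    detected[u] = k + 1     # iteration of first detection
    return x, detected

# Hypothetical line network 0-1-2-3-4-5 with the source at node 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
eps = 0.01
final, detected = simulate(edges, source=0, s=100.0, eps=eps,
                           failed={3}, k_cut=100, k_end=200)
print(detected)
print({u: round(value, 3) for u, value in sorted(final.items())})
```

After node 3 fails, nodes 4 and 5 only exchange states with each other, so their states shrink geometrically and cross the threshold a few iterations after the cut, while the states of nodes 0 to 2 settle at new positive values.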

4.2. Performance of DSSD in a mobile network 

Figures 5(a)-(d) show four snapshots of a communication network of 200 mobile agents. The agents are divided into two groups, though there is no clear spatial separation between the two groups initially. The position of agent $u$, denoted by $Z_u$, is updated according to:

$$Z_u(k+1) = Z_u(k) + \begin{bmatrix} \delta Z_{ux}(k) \\ \delta Z_{uy}(k) \end{bmatrix}, \tag{12}$$

where $\delta Z_{ux}(k)$, $\delta Z_{uy}(k)$, for every $u$ and $k$, are independent random numbers. For agents in the first group, both $\delta Z_{ux}$ and $\delta Z_{uy}$ are normally distributed with mean 0.003 and variance 0.0003. For the second group, $\delta Z_{ux}$, $\delta Z_{uy}$ are normally distributed with mean -0.003 and variance 0.0003. The motion of the agents results in the network composed of two disjoint components at $k = 28$, four components at $k = 56$, and then again two components at $k = 80$.

[Figure 5 panels: (a) $\mathcal{G}$ at $k = 24$; (b) $\mathcal{G}(k)$ for $k = 29$]

Figure 5: Four snapshots of a network of 200 mobile agents.

The evolution of the states of four agents $i$, $j$, $p$, and $q$ is shown in Figure 6(a-b). The loss of connectivity of agent $q$ from the source occurs at $k = 28$ and is detected at $k = 55$. Connectivity to the source is regained at $k = 80$ and is detected at $k = 81$ (when the state became greater than $\epsilon$). These simulations provide evidence that the algorithm is indeed effective in detecting disconnections and re-connections, irrespective of whether the network is made up of static or mobile agents.
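
A minimal sketch of the mobility model in (12) follows, together with a count of the connected components of the resulting communication graph (agents closer than 0.11 are linked, as in the simulation set-up). Note that `random.gauss` takes a standard deviation, so the stated variance of 0.0003 is converted with a square root. The alternating group assignment, the random seed, and the function names are assumptions, so the component counts produced here need not match those reported above.

```python
import math
import random

def move_agents(positions, groups, rng, mean=0.003, variance=0.0003):
    """Apply one step of the random-walk position update to every agent."""
    sigma = math.sqrt(variance)
    moved = []
    for (x, y), group in zip(positions, groups):
        drift = mean if group == 0 else -mean   # the two groups drift apart
        moved.append((x + rng.gauss(drift, sigma),
                      y + rng.gauss(drift, sigma)))
    return moved

def count_components(positions, radius=0.11):
    """Number of connected components of the proximity graph."""
    unseen = set(range(len(positions)))
    components = 0
    while unseen:
        stack = [unseen.pop()]
        components += 1
        while stack:
            u = stack.pop()
            near = {v for v in unseen
                    if math.dist(positions[u], positions[v]) < radius}
            unseen -= near
            stack.extend(near)
    return components

rng = random.Random(0)                          # arbitrary seed
n = 200
positions = [(rng.random(), rng.random()) for _ in range(n)]
groups = [u % 2 for u in range(n)]              # hypothetical group assignment
history = []
for k in range(80):
    positions = move_agents(positions, groups, rng)
    history.append(count_components(positions))
print(history[27], history[55], history[79])
```

Feeding the neighbor sets of each snapshot into the state update would reproduce the kind of disconnection and re-connection traces discussed in this subsection.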



5. System Implementation and Experimental Evaluation 

In this section we describe the implementation, deployment and performance evaluation of a separation detection system for robotic sensor networks based on the DSSD algorithm. We implemented the system, using the nesC language, on Berkeley motes [20] running the TinyOS operating system [21]. The code uses 16KB of program memory and 719B of RAM. The separation detection system executes in two phases: Reliable Neighbor Discovery, and the DSSD algorithm.

Figure 7: Partial view of the 24 node outdoor system deployment.

In the Reliable Neighbor Discovery phase, each node broadcasts a set of beacons in a small, fixed time interval. Upon receiving a beacon from node $v_i$, a node updates the number of beacons received from node $v_i$. Next, an iteration of the DSSD algorithm executes. To determine whether a communication link is established, each node first computes for each of its neighbors the Packet Reception Ratio (PRR), defined as the ratio of the number of beacons successfully received from a neighbor to the total number of beacons sent by that neighbor. A neighbor is deemed reliable if its PRR > 0.8. After receiving state information from neighbors, a node updates its state according to Equation (1) and broadcasts its new state. When a broadcast from a neighbor is not received for 2 iterations, the last reported state of the neighbor is used for calculating the state. A neighbor from which no broadcast is received for 4 iterations is permanently removed from the neighbor table. The state is stored in the 512KB on-board flash memory at each iteration (for a total of about 1.6KB for 200 iterations) for post-deployment analysis. In order to monitor connectivity information, each node broadcasts its neighbor table along with the state.
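
The neighbor bookkeeping described above can be sketched as follows. This is not the nesC code that ran on the motes; it is a Python illustration in which the class `NeighborTable`, its methods, and the helper `next_state` are hypothetical names. As a simplification, the last reported state of a silent neighbor is reused until the neighbor is removed after 4 silent iterations, and the state update is assumed to take the form of a sum over neighbor states (plus $s$ at the source) divided by the number of neighbors plus one.

```python
PRR_THRESHOLD = 0.8   # a neighbor is reliable if its PRR exceeds this value
REMOVE_AFTER = 4      # silent iterations before permanent removal

class NeighborTable:
    def __init__(self, beacons_per_round):
        self.beacons_per_round = beacons_per_round
        self.last_state = {}   # neighbor id -> last reported state
        self.missed = {}       # neighbor id -> consecutive silent iterations
        self.removed = set()   # neighbors dropped for good

    def reliable(self, beacons_received):
        """Neighbors whose packet reception ratio exceeds the threshold."""
        return {v for v, count in beacons_received.items()
                if count / self.beacons_per_round > PRR_THRESHOLD}

    def update(self, beacons_received, reported_states):
        """Process one iteration; return the states used in the update."""
        heard = {v for v in self.reliable(beacons_received)
                 if v in reported_states and v not in self.removed}
        for v in heard:
            self.last_state[v] = reported_states[v]
            self.missed[v] = 0
        for v in set(self.last_state) - heard:
            self.missed[v] += 1
            if self.missed[v] >= REMOVE_AFTER:
                del self.last_state[v]
                del self.missed[v]
                self.removed.add(v)
        return dict(self.last_state)

def next_state(neighbor_states, is_source, s):
    """Assumed form of the state update applied to the current neighbor table."""
    total = sum(neighbor_states.values()) + (s if is_source else 0.0)
    return total / (len(neighbor_states) + 1)

table = NeighborTable(beacons_per_round=10)
states = table.update({7: 9, 8: 8}, {7: 1.5, 8: 4.0})   # node 8 has PRR 0.8
print(states, next_state(states, is_source=False, s=100.0))
```

Keeping a `removed` set makes the removal permanent, as stated in the text; a deployment that should tolerate returning neighbors would drop that set.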
Figure 8: (a) The network topology during the outdoor deployment. (b) The states of nodes u and v (as labeled in (a)), which are connected and disconnected, respectively, from the source after the cut has occurred.

Figure 9: (a) Test set-up for mobile network experiments. The human agents are not shown. (b) Mobile node consisting of a Berkeley mote on a Roomba robot.

To ensure a lock-step execution of the algorithm, all nodes are started at approximately the same time. For this, 
a mote acting as a base station, connected to a laptop, broadcasts a "system start" message, which is resent by each 
sensor node at most once. The base station is also used for monitoring the execution of the algorithm and monitoring 
the inter-mote communication. 

5.1. Experimental Performance Evaluation in Static Network

For evaluating the performance of our separation detection system in static networks, we deployed a network of 24 motes in a 13 × 5 m² outdoor field at Texas A&M University. Because the motes were positioned on the ground, the radio range was reduced considerably, with a one-hop distance of about 1.5 m. The network connectivity is depicted in Figure 8(a). A partial view of the outdoor deployment is shown in Figure 7.

In our deployment, the source strength was specified as $s = 100$, the iteration length was 5 sec (this value could easily be reduced to as small as 200 msec) and the cut detection threshold was $\epsilon = 0.01$. Experimental results for two of the sensor nodes deployed are shown in Figure 8. After about 30 iterations the states of all nodes converged. At iteration $k = 83$ a cut is created by turning off motes inside the rectangle labeled "Cut" in Figure 8(a). Figures 8(b) and 8(c) show the states for nodes $u$ and $v$, as depicted in Figure 8(a), which were connected and disconnected, respectively, from the source node after the cut. The evolution of their states follows the aforementioned experimental scenario. Node $v$ declares itself cut from the source at $k = 100$, since its state falls below the threshold 0.01 at that time.

Figure 10: Four snapshots of a network of 8 mobile agents and the state evolution for them resulting from the DSSD algorithm. The dashed lines represent communication links.



5.2. Experimental Performance Evaluation in Mobile Network 

For evaluating the performance of our separation detection system in mobile networks we deployed 8 sensor nodes 
in an indoor environment, with 4 of the nodes residing on Roomba robots and 4 on human subjects. The scenario 
we emulated was that of a robot-assisted emergency response team. Figure 9 shows part of the test set-up with the
mobile nodes. 

The network topologies as well as the locations of the nodes at a few time instants are shown in Figure 10. As we
can see from the figure, the topology of the network varied greatly over time due to the mobility of the nodes. 

Figure 11 shows the time-traces of the node states during the experiment. The network is connected until time $k = 120$, and the states of all nodes converge to positive numbers; see Figure 11(a). This is consistent with the prediction of Theorem 2. At approximately iteration $k = 120$, four of the nodes (nodes 5 through 8), carried by human subjects, are disconnected from the rest of the network, and in particular, from the source node 1. A sample network topology during the time interval $k = 120$ to $k = 170$ is shown in Figure 10(b). As we can see from Figure 11(b-c), the states of the disconnected nodes 5 through 8 converge to zero. The nodes 5, 6, 7, 8 detect that they are separated from the source at times $k = 145, 145, 143, 143$, respectively, when their states become lower than the threshold $\epsilon = 0.01$. At approximately iteration $k = 170$, node 5 rejoins the sub-network formed by nodes 1-4. As a result of node 5 moving, node 7 becomes a bridge between the two sub-networks. Hence, after iteration $k = 170$, the states of nodes 6, 7 and 8 become positive, since the network is connected again. However, this re-connection is temporary, and nodes 6 through 8 again become disconnected from the source after some time, which is seen in their states. Another temporary connection occurs between the set of nodes 6-8 and the set of nodes 1-5, during the time interval $k = 180$ through $k = 210$, followed by a separation. Finally, after iteration $k = 260$, the network becomes connected again, as shown in Figure 10(d). As a result, the states of all the nodes become positive after time $k = 225$, and they detect their re-connections to the source.

Figure 11: The states of nodes 1 through 8 in the mobile network experiment with Roomba robots and human agents.



6. Conclusions 

In this paper we introduced the Distributed Source Separation Detection (DSSD) algorithm to detect network 
separation in robotic and sensor networks. Simulations and hardware experiments demonstrated the efficacy of the 
algorithm. DSSD requires communication only between neighbors, which avoids the need for routing, making it 
particularly suitable for mobile networks. The algorithm is distributed, doesn't require time synchronization among 
nodes, and the computations involved are simple. The DSSD algorithm is applicable to a heterogeneous network of 
static as well as mobile nodes, with varying levels of resources, precisely the kind envisioned for robotic and sensor 
networks. 